Collocational Grammar
Author
Abstract
Everyone is fighting a fierce rearguard action, struggling towards the objective of discrete syntactic classes, all the while being forced by practical necessity to move closer and closer toward an analysis of relationships whenever they want to get anything done. Everyone is still trying to find relationships of classes rather than classes of relationships. In n-gram models relationships are used but not classified, and enormous data requirements make their direct interpretation impractical. In his history-based parsing, Magerman effects the classification of one half of the infinity of relationships he seeks to use, but the use of relationship classes is implicit rather than explicit. Schuetze derives relationship classes, but only within a lexical, not a syntactic, context [1]. Generalizations are supposed to simplify problems, not create them. The lexical classes of traditional analyses are cognitive classes. They may properly be the objectives of our analyses, but they need not be the means. In practice they seem to cause more problems than they solve. It is the thesis of this paper that this is because they are dependent on a more fundamental classification, that of relationships or collocational structure. I have sought to show that it is relationship classes which underlie the success of "data based" language models such as n-gram models and statistical parsers, and that the most efficient way of modelling relationship classes is in terms of an analysis which factors out the greatest number of similarities in different token strings. Rather than trying to extrapolate from lexical generality to structural generality, I feel we should be moving from structural generality to lexical and syntactic generality. We can still have our familiar syntax categories, but only in the context of a sub-class of the wider collocational classification. The central issue of NLP becomes the efficient classification not of parts of speech but of collocational regularity, and the single most important tool in the analysis of language structure becomes an effective means of modelling similarities in strings.
[1] Others are similarly led to "relationship classes" as a means of resolving lexical ambiguity, e.g. Yarowsky (1993). The "supertags" of Joshi and Srinivas (1994) seem very close to relationship classes, though their formulation is still strongly influenced by concepts of lexical generalization.
Similar resources
A Correlational Study of Expectancy Grammar’s Manifestation on Cloze Test and Lexical Collocational Density
The notion of expectancy grammar as a key to understanding the nature of psychologically real processes that underlie language use is introduced by Oller (1979). A central issue in this notion is that expectancy generating systems are constructed and modified in the course of language acquisition. Thus, one of the characteristics of language proficiency is that it consists of such an expectancy...
An exploratory study of collocational use by ESL students – A task based approach
Collocation is an aspect of language generally considered arbitrary by nature and problematic to L2 learners who need collocational competence for effective communication. This study attempts, from the perspective of L2 learners, to have a deeper understanding of collocational use and some of the problems involved, by adopting a task based approach, using two highly comparable corpora based on ...
The Generation of Idiomatic and Collocational Expressions
Collocations whose semantic content is not or only partially composed from the semantic content of their parts are often viewed as problematic for generation. In this paper a tactical generator combining FUF as the generation engine and HPSG as the grammar framework is presented. It is shown that the lexicon-driven approach to syntactic and semantic processing is well-suited for the generation...
Braucht niemanden zu scheren: A Survey of NPI Licensing in German
In this contribution we will argue that negative polarity is a collocational phenomenon that does not follow from other properties of the respective lexical elements. With German data as evidence, we will follow a proposal by van der Wouden and treat Negative Polarity Items (NPIs) as collocates which must be licensed by abstract semantic properties of their contexts. Using a collocation module ...
Evaluating a German Sketch Grammar: A Case Study on Noun Phrase Case
Word sketches are part of the Sketch Engine corpus query system. They represent automatic, corpus-derived summaries of the words’ grammatical and collocational behaviour. Besides the corpus itself, word sketches require a sketch grammar, a regular expression-based shallow grammar over the part-of-speech tags, to extract evidence for the properties of the targeted words from the corpus. The pape...
Towards a corpus-based dictionary of German noun-verb collocations
We describe our attempts to automatically extract raw material for a dictionary of German noun-verb collocations from large corpora of newspaper text. Such a dictionary should be about collocations and it should include a description of their linguistic properties, rather than listing the mere lexical cooccurrence. Since most statistical collocation finding tools do not provide other than lexic...
Journal title:
- CoRR
Volume cmp-lg/9604007
Publication date 1996